In any case, it is advisable to check the quality parameters of a database first, in
order to assess the actual information content and the usability of the information
provided for one’s own scientific work and the conclusions drawn from it.
Users should therefore always take a look at the quality parameters before using the
information provided.
With that, we understand why bioinformatics now works so quickly and so reliably.
It uses programs that are fast and yet surprisingly accurate (heuristics). And it draws
on good, highly sophisticated databases whose entries can be trusted because they are very
well maintained.
A few other notable heuristics should be mentioned here. Besides the BLAST
sequence search, BLAT achieves a further speedup, as does Mega-BLAST (the expert then
knows what these BLAST variants are more likely to overlook).
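The seed-and-extend idea behind such fast searches can be illustrated with a minimal sketch. It is not the BLAST, BLAT or Mega-BLAST implementation: the function names, the word length k=4, and the restriction to exact, ungapped matches are assumptions made purely for illustration.

```python
from collections import defaultdict


def build_kmer_index(subject, k):
    """Map every k-mer of the subject to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(subject) - k + 1):
        index[subject[i:i + k]].append(i)
    return index


def seed_and_extend(query, subject, k=4):
    """Return (length, query_start, subject_start) of the longest exact,
    ungapped match that contains at least one shared k-mer (the seed).

    Illustrative assumption: only identical letters extend a seed;
    real tools score mismatches and gaps as well.
    """
    index = build_kmer_index(subject, k)
    best = (0, -1, -1)
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], []):
            # Extend the seed to the left while the letters agree.
            ql, sl = q, s
            while ql > 0 and sl > 0 and query[ql - 1] == subject[sl - 1]:
                ql -= 1
                sl -= 1
            # Extend the seed to the right while the letters agree.
            qr, sr = q + k, s + k
            while qr < len(query) and sr < len(subject) and query[qr] == subject[sr]:
                qr += 1
                sr += 1
            if qr - ql > best[0]:
                best = (qr - ql, ql, sl)
    return best


hit = seed_and_extend("TTGACCGTAAG", "AAAGACCGTAACC")
missed = seed_and_extend("ACG", "TTACGTT")  # match is shorter than k
```

The second call shows the price of the speedup: a match shorter than the word length never produces a seed and is therefore overlooked, and a longer word length overlooks more.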
Even the prediction of 3-D structures is sped up by heuristics. In particular,
many reasonably fast modeling programs rely on homology modeling, that is, they use
known structures to model the unknown structure if it is sufficiently similar. This heuristic
does not yield an exact model and assumes that the new structure is sufficiently similar to
a known one. Threading takes the heuristic even further. Here it is assumed that even an
unknown 3-D structure can be predicted by combining and testing known 3-D structures. To do
this, the sequence of the unknown structure is threaded onto the known 3-D structures.
One then calculates which region is best covered by which known structure. This is not
exact, just a heuristic.
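The "sufficiently similar" condition of homology modeling can be sketched as a simple template selection step. This is a toy illustration, not the procedure of any particular modeling program: the function names, the assumption that the sequences are already aligned and of equal length, and the 30 % identity cutoff (a common rule of thumb) are all assumptions.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions in two pre-aligned sequences.

    Assumption: both sequences are already aligned and equally long.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def pick_template(query, templates, min_identity=30.0):
    """Return (name, identity) of the most similar known structure,
    or (None, identity) if no template reaches the cutoff.

    min_identity is an illustrative rule-of-thumb value.
    """
    best_name, best_identity = None, 0.0
    for name, sequence in templates.items():
        identity = percent_identity(query, sequence)
        if identity > best_identity:
            best_name, best_identity = name, identity
    if best_identity < min_identity:
        return None, best_identity
    return best_name, best_identity


# Hypothetical template sequences, for illustration only.
templates = {"t1": "ACDEFGHIKV", "t2": "WWWWWWWWWL"}
chosen = pick_template("ACDEFGHIKL", templates)
rejected = pick_template("ACDEFGHIKL", {"t2": "WWWWWWWWWL"})
```

If no known structure passes the cutoff, the heuristic simply has nothing to build on, which is the situation where threading or other methods are tried instead.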
One may be surprised at how quickly the protein interaction database STRING (EMBL)
calculates interactions. It uses a trick that a number of other databases also use:
all interactions are precomputed, which takes many weeks, with each update of the
database. A single database query then only has to look up where the best entry for the
query is located in the database. If one or more sequences are entered, this is done via a
sequence comparison (with BLAST); if a keyword is entered, this is done via a fast text search.
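The precompute-then-look-up trick can be sketched in a few lines. This is not how STRING is implemented: the scoring function is a hypothetical placeholder (overlap of feature sets), and the protein names, features and cutoff are invented for illustration.

```python
from itertools import combinations


def expensive_score(features_a, features_b):
    """Hypothetical stand-in for the costly interaction calculation:
    the Jaccard overlap of two feature sets."""
    union = features_a | features_b
    return len(features_a & features_b) / len(union) if union else 0.0


def precompute(proteins, cutoff=0.3):
    """Slow step, run once per database update: score all pairs and
    store the partners of each protein, best score first."""
    table = {name: [] for name in proteins}
    for a, b in combinations(proteins, 2):
        score = expensive_score(proteins[a], proteins[b])
        if score >= cutoff:
            table[a].append((b, score))
            table[b].append((a, score))
    for partners in table.values():
        partners.sort(key=lambda entry: entry[1], reverse=True)
    return table


def query(table, name):
    """Fast step, run per user request: a plain dictionary lookup."""
    return table.get(name, [])


# Invented example data.
PROTEINS = {
    "P1": {"kinase", "membrane", "signal"},
    "P2": {"kinase", "signal"},
    "P3": {"ribosome"},
}
TABLE = precompute(PROTEINS)
partners_of_p1 = query(TABLE, "P1")
```

The cost of scoring all pairs is paid once in `precompute`; each later `query` only reads the stored result, which is why the answer appears almost instantly.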
Metabolic models often make the heuristic assumption of a steady-state equilibrium and
then calculate the underlying enzyme chains for this equilibrium (flux balance analysis;
elementary mode analysis uses the same principle). Even when, for example, YANAsquare
calculates flux strengths, it makes the simplifying assumption that gene expression data
6.2 Maintenance of Databases and Acceleration of Programs